-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Plotting Wrappers: Occupancy Histogram #403
Conversation
3fe51e8
to
2cc312a
Compare
This comment was marked as resolved.
This comment was marked as resolved.
415f620
to
e412849
Compare
@sfmig your comment on #5 indicates that it would be useful to have certain bits of information from the plot as outputs from this kind of function. Currently I'm just exposing the other But if so, we can also close #5 with this PR too. |
thanks for checking @willGraham01 ! The point of that comment was that often you not only want the figure, but also the 2D array with the bin counts. From your comment ...
seems like that is covered? So I think we can close #5 yay 😄 🚀 (Just fyi I vaguely remember this was something Sepi requested but not sure) |
I hope it is b/c otherwise I've just wasted 5 hours of Niko's grant 🤭 😂 But will mark #5 as closable by this 🥳 |
I will finish reviewing this tomorrow, but I can already do some cool things with this! See source code for this figureimport numpy as np
from matplotlib import pyplot as plt
from movement import sample_data
from movement.plots import plot_occupancy
# Load the sample dataset
ds = sample_data.fetch_dataset("DLC_two-mice.predictions.csv")
# Compute the centroid of all keypoints
centroid_position = ds.position.mean("keypoints")
image = plt.imread(ds.attrs["frame_path"])
# Construct bins of size 20x20 pixels that cover the entire image
bin_pix = 30
bins = [
np.arange(0, image.shape[0] + bin_pix, bin_pix),
np.arange(0, image.shape[1] + bin_pix, bin_pix),
]
# Initialize the figure and axis
fig, ax = plt.subplots()
# Show the image
ax.imshow(image)
# Plot the occupancy 2D histogram for each individual
_, _, hist_data = plot_occupancy(
da=centroid_position,
selection={"individuals": "individual1"},
ax=ax,
cmap="viridis",
alpha=0.5,
bins=bins,
cmin=3, # Set the minimum shown count
norm="log"
)
# Set the axis limits to match the image
ax.set_xlim(0, image.shape[1])
ax.set_ylim(image.shape[0], 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @willGraham01!
I’ve added some comments, mostly about aligning the function signature (and default behavior) with that of plot_trajectory()
.
Regarding your discussion with Sofía:
Yes, this approach technically meets the requirement of also obtaining the occupancy data as a 2D array, which is excellent. However, it can be slightly awkward to always rely on the plotting function when all you need is the occupancy array. There may be scenarios where the user only wants the 2D occupancy array—without the plot—for comparisons with neural data. From that perspective, it might be more intuitive to have a dedicated compute_occupancy
function that returns both the 2D array and the bin edges. We could discuss the best data structure to return—whether that’s an xr.DataArray
or multiple NumPy arrays, similar to hist2d
.
In any case, I suggest merging this PR with just plot_occupancy
(after addressing my comments) and leaving compute_occupancy
for a future PR. We just need to ensure that both functions produce consistent histogram data, i.e. compute_occupancy
should use the same underlying method as hist2d
.
e412849
to
8ecd914
Compare
Function signature is now in line with plot trajectory, and I've implemented the default behaviour of "take centroid of keypoints, then aggregate over individuals". Note that the aggregation over individuals works by "stacking" the input
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @willGraham01! I like your approach to aggregating, it's entirely sensible.
I left some final comments to be addressed. I'll pre-approve this PR so you have the freedom to merge it as soon as you're done addressing those comments.
movement/plots/occupancy.py
Outdated
if "individuals" in data.dims: | ||
data = data.stack( | ||
{"new": ("time", "individuals")}, create_index=False | ||
).rename({"new": "time"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some cases I get the following warning here:
/Users/nsirmpilatze/Code/NIU/movement/movement/plots/occupancy.py:157: UserWarning: rename 'new' to 'time' does not create an index anymore. Try using swap_dims instead or use set_index after rename to create an indexed coordinate.
).rename({"new": "time"})
Co-authored-by: Niko Sirmpilatze <[email protected]>
for more information, see https://pre-commit.ci
Description
What is this PR
Why is this PR needed?
See #388 and related, #5 (which is actually also closed)
What does this PR do?
Adds the
plot_occupancy
function to themovement.plots
module. This function takes in(time, space [x, y])
-data and produces a histogram showing the distribution of positions across all time-points.By default, any additional axes in the input
da
(DataArray
) are collapsed onto the 0th-index, to provide the expected 2D data input. Theselection
argument can be used by the user to specify alternative coordinates along non-spacetime dimensions to collapse onto instead.plot_occupancy
returns the usual figure and axes objects, however also returns information from the plotted histogram as its third value. This is mainly because this information is difficult to re-extract from the returnedaxes
figure. Thecounts
information in particular would technically otherwise be lost sinceQuadMesh
objects (that store histograms) only retain the colour-mapped values (which may blur across bins with similar, but distinct counts), and not the raw counts in each bin.References
Closes #388.
Additionally, this hopefully goes some way towards addressing #5, since we are returning the histogram data as the 3rd return value.Closes #5 too.
How has this PR been tested?
Addition of tests to cover expected functionality, and possible edge cases.
Is this a breaking change?
No
Does this PR require an update to the documentation?
#410
Checklist: